-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ValueLattice] Reset the count of range extensions when merging an undef with a constant range. #77307
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Yingwei Zheng (dtcxzyw) ChangesThis patch resets llvm-project/llvm/include/llvm/Analysis/ValueLattice.h Lines 375 to 380 in 4a45648
Fixes a regression caused by #76295. Full diff: https://github.com/llvm/llvm-project/pull/77307.diff 2 Files Affected:
diff --git a/llvm/include/llvm/Analysis/ValueLattice.h b/llvm/include/llvm/Analysis/ValueLattice.h
index 2898cdd3d7b0ca..196d7d11e13da4 100644
--- a/llvm/include/llvm/Analysis/ValueLattice.h
+++ b/llvm/include/llvm/Analysis/ValueLattice.h
@@ -406,6 +406,7 @@ class ValueLatticeElement {
if (isUnknown()) {
assert(!RHS.isUnknown() && "Unknow RHS should be handled earlier");
*this = RHS;
+ NumRangeExtensions = 0;
return true;
}
diff --git a/llvm/test/Transforms/SCCP/widen-steps-limit.ll b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
new file mode 100644
index 00000000000000..98224c4d495705
--- /dev/null
+++ b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
@@ -0,0 +1,91 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=sccp < %s | FileCheck %s
+
+define void @test() {
+; CHECK-LABEL: define void @test() {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @geti8()
+; CHECK-NEXT: [[CONV_I:%.*]] = zext i8 [[TMP0]] to i32
+; CHECK-NEXT: [[CMP19_I_NOT:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br i1 [[CMP19_I_NOT]], label [[WHILE_END_I_THREAD:%.*]], label [[EXIT2:%.*]]
+; CHECK: while.end.i.thread:
+; CHECK-NEXT: [[TMP1:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
+; CHECK-NEXT: br label [[EXIT2]]
+; CHECK: exit2:
+; CHECK-NEXT: [[CMP_I12_I4:%.*]] = phi i32 [ [[TMP2]], [[WHILE_END_I_THREAD]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-NEXT: [[CMP:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN6:%.*]], label [[IF_END_I_I:%.*]]
+; CHECK: if.then6:
+; CHECK-NEXT: [[CMP_I32:%.*]] = icmp ugt i32 [[CMP_I12_I4]], [[CONV_I]]
+; CHECK-NEXT: br i1 [[CMP_I32]], label [[EXIT:%.*]], label [[COND_FALSE_I:%.*]]
+; CHECK: cond.false.i:
+; CHECK-NEXT: [[CMP_NOT6_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br i1 [[CMP_NOT6_I_I]], label [[EXIT]], label [[FOR_BODY_PREHEADER_I_I:%.*]]
+; CHECK: for.body.preheader.i.i:
+; CHECK-NEXT: [[CMP_NOT_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: exit:
+; CHECK-NEXT: [[RETVAL_SROA_3_0_I:%.*]] = phi i32 [ 0, [[IF_THEN6]] ], [ 1, [[COND_FALSE_I]] ], [ 1, [[FOR_BODY_PREHEADER_I_I]] ]
+; CHECK-NEXT: br i1 [[CMP19_I_NOT]], label [[IF_END_I_I]], label [[DO_BODY_I_I:%.*]]
+; CHECK: do.body.i.i:
+; CHECK-NEXT: br label [[IF_END_I_I]]
+; CHECK: if.end.i.i:
+; CHECK-NEXT: ret void
+;
+entry:
+ %0 = tail call i8 @geti8()
+ %conv.i = zext i8 %0 to i32
+ %cmp19.i.not = tail call i1 @geti1()
+ br i1 %cmp19.i.not, label %while.end.i.thread, label %exit2
+
+while.end.i.thread: ; preds = %entry
+ %1 = tail call i1 @geti1()
+ %2 = zext i1 %1 to i32
+ br label %exit2
+
+exit2: ; preds = %while.end.i.thread, %entry
+ %cmp.i12.i4 = phi i32 [ %2, %while.end.i.thread ], [ 0, %entry ]
+ %cmp = tail call i1 @geti1()
+ br i1 %cmp, label %if.then6, label %if.end.i.i
+
+if.then6: ; preds = %exit2
+ %cmp.i32 = icmp ugt i32 %cmp.i12.i4, %conv.i
+ br i1 %cmp.i32, label %exit, label %cond.false.i
+
+cond.false.i: ; preds = %if.then6
+ %cmp.not6.i.i = tail call i1 @geti1()
+ br i1 %cmp.not6.i.i, label %exit, label %for.body.preheader.i.i
+
+for.body.preheader.i.i: ; preds = %cond.false.i
+ %cmp.not.i.i = tail call i1 @geti1()
+ br label %exit
+
+exit: ; preds = %for.body.preheader.i.i, %cond.false.i, %if.then6
+ %retval.sroa.3.0.i = phi i32 [ 0, %if.then6 ], [ 1, %cond.false.i ], [ 1, %for.body.preheader.i.i ]
+ br i1 %cmp19.i.not, label %if.end.i.i, label %do.body.i.i
+
+do.body.i.i: ; preds = %exit, %do.cond.i.i
+ %result.sroa.7.0.i.i = phi i32 [ %result.sroa.7.1.i.i, %do.cond.i.i ], [ %retval.sroa.3.0.i, %exit ]
+ switch i32 %result.sroa.7.0.i.i, label %do.cond.i.i [
+ i32 2, label %sw.bb.i.i
+ i32 1, label %if.end.i.i
+ i32 0, label %if.end.i.i
+ ]
+
+sw.bb.i.i: ; preds = %do.body.i.i
+ %3 = tail call i32 @geti32()
+ br label %do.cond.i.i
+
+do.cond.i.i: ; preds = %sw.bb.i.i, %do.body.i.i
+ %result.sroa.7.1.i.i = phi i32 [ %result.sroa.7.0.i.i, %do.body.i.i ], [ %3, %sw.bb.i.i ]
+ %cmp22.i.i = tail call i1 @geti1()
+ br i1 %cmp22.i.i, label %do.body.i.i, label %if.end.i.i
+
+if.end.i.i: ; preds = %do.cond.i.i, %do.body.i.i, %do.body.i.i, %exit, %exit2
+ ret void
+}
+
+declare i8 @geti8()
+declare i32 @geti32()
+declare i1 @geti1()
|
@llvm/pr-subscribers-llvm-analysis Author: Yingwei Zheng (dtcxzyw) ChangesThis patch resets llvm-project/llvm/include/llvm/Analysis/ValueLattice.h Lines 375 to 380 in 4a45648
Fixes a regression caused by #76295. Full diff: https://github.com/llvm/llvm-project/pull/77307.diff 2 Files Affected:
diff --git a/llvm/include/llvm/Analysis/ValueLattice.h b/llvm/include/llvm/Analysis/ValueLattice.h
index 2898cdd3d7b0ca..196d7d11e13da4 100644
--- a/llvm/include/llvm/Analysis/ValueLattice.h
+++ b/llvm/include/llvm/Analysis/ValueLattice.h
@@ -406,6 +406,7 @@ class ValueLatticeElement {
if (isUnknown()) {
assert(!RHS.isUnknown() && "Unknow RHS should be handled earlier");
*this = RHS;
+ NumRangeExtensions = 0;
return true;
}
diff --git a/llvm/test/Transforms/SCCP/widen-steps-limit.ll b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
new file mode 100644
index 00000000000000..98224c4d495705
--- /dev/null
+++ b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
@@ -0,0 +1,91 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=sccp < %s | FileCheck %s
+
+define void @test() {
+; CHECK-LABEL: define void @test() {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = tail call i8 @geti8()
+; CHECK-NEXT: [[CONV_I:%.*]] = zext i8 [[TMP0]] to i32
+; CHECK-NEXT: [[CMP19_I_NOT:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br i1 [[CMP19_I_NOT]], label [[WHILE_END_I_THREAD:%.*]], label [[EXIT2:%.*]]
+; CHECK: while.end.i.thread:
+; CHECK-NEXT: [[TMP1:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
+; CHECK-NEXT: br label [[EXIT2]]
+; CHECK: exit2:
+; CHECK-NEXT: [[CMP_I12_I4:%.*]] = phi i32 [ [[TMP2]], [[WHILE_END_I_THREAD]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-NEXT: [[CMP:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN6:%.*]], label [[IF_END_I_I:%.*]]
+; CHECK: if.then6:
+; CHECK-NEXT: [[CMP_I32:%.*]] = icmp ugt i32 [[CMP_I12_I4]], [[CONV_I]]
+; CHECK-NEXT: br i1 [[CMP_I32]], label [[EXIT:%.*]], label [[COND_FALSE_I:%.*]]
+; CHECK: cond.false.i:
+; CHECK-NEXT: [[CMP_NOT6_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br i1 [[CMP_NOT6_I_I]], label [[EXIT]], label [[FOR_BODY_PREHEADER_I_I:%.*]]
+; CHECK: for.body.preheader.i.i:
+; CHECK-NEXT: [[CMP_NOT_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT: br label [[EXIT]]
+; CHECK: exit:
+; CHECK-NEXT: [[RETVAL_SROA_3_0_I:%.*]] = phi i32 [ 0, [[IF_THEN6]] ], [ 1, [[COND_FALSE_I]] ], [ 1, [[FOR_BODY_PREHEADER_I_I]] ]
+; CHECK-NEXT: br i1 [[CMP19_I_NOT]], label [[IF_END_I_I]], label [[DO_BODY_I_I:%.*]]
+; CHECK: do.body.i.i:
+; CHECK-NEXT: br label [[IF_END_I_I]]
+; CHECK: if.end.i.i:
+; CHECK-NEXT: ret void
+;
+entry:
+ %0 = tail call i8 @geti8()
+ %conv.i = zext i8 %0 to i32
+ %cmp19.i.not = tail call i1 @geti1()
+ br i1 %cmp19.i.not, label %while.end.i.thread, label %exit2
+
+while.end.i.thread: ; preds = %entry
+ %1 = tail call i1 @geti1()
+ %2 = zext i1 %1 to i32
+ br label %exit2
+
+exit2: ; preds = %while.end.i.thread, %entry
+ %cmp.i12.i4 = phi i32 [ %2, %while.end.i.thread ], [ 0, %entry ]
+ %cmp = tail call i1 @geti1()
+ br i1 %cmp, label %if.then6, label %if.end.i.i
+
+if.then6: ; preds = %exit2
+ %cmp.i32 = icmp ugt i32 %cmp.i12.i4, %conv.i
+ br i1 %cmp.i32, label %exit, label %cond.false.i
+
+cond.false.i: ; preds = %if.then6
+ %cmp.not6.i.i = tail call i1 @geti1()
+ br i1 %cmp.not6.i.i, label %exit, label %for.body.preheader.i.i
+
+for.body.preheader.i.i: ; preds = %cond.false.i
+ %cmp.not.i.i = tail call i1 @geti1()
+ br label %exit
+
+exit: ; preds = %for.body.preheader.i.i, %cond.false.i, %if.then6
+ %retval.sroa.3.0.i = phi i32 [ 0, %if.then6 ], [ 1, %cond.false.i ], [ 1, %for.body.preheader.i.i ]
+ br i1 %cmp19.i.not, label %if.end.i.i, label %do.body.i.i
+
+do.body.i.i: ; preds = %exit, %do.cond.i.i
+ %result.sroa.7.0.i.i = phi i32 [ %result.sroa.7.1.i.i, %do.cond.i.i ], [ %retval.sroa.3.0.i, %exit ]
+ switch i32 %result.sroa.7.0.i.i, label %do.cond.i.i [
+ i32 2, label %sw.bb.i.i
+ i32 1, label %if.end.i.i
+ i32 0, label %if.end.i.i
+ ]
+
+sw.bb.i.i: ; preds = %do.body.i.i
+ %3 = tail call i32 @geti32()
+ br label %do.cond.i.i
+
+do.cond.i.i: ; preds = %sw.bb.i.i, %do.body.i.i
+ %result.sroa.7.1.i.i = phi i32 [ %result.sroa.7.0.i.i, %do.body.i.i ], [ %3, %sw.bb.i.i ]
+ %cmp22.i.i = tail call i1 @geti1()
+ br i1 %cmp22.i.i, label %do.body.i.i, label %if.end.i.i
+
+if.end.i.i: ; preds = %do.cond.i.i, %do.body.i.i, %do.body.i.i, %exit, %exit2
+ ret void
+}
+
+declare i8 @geti8()
+declare i32 @geti32()
+declare i1 @geti1()
|
Ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really clear to me that this change is right. How NumRangeExtensions should behave when merging lattice values isn't clear to me. The proposed behavior is asymmetric -- but maybe that's okay?
@fhahn Any comments? |
Ping @fhahn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So previously when merging in the first other lattice value, it would re-use that lattice value's NumRangeExtension, whereas now it gets set to 0. This should guarantee that we still converge as it's only set when merging with Unknown, but possibly slower.
In particular consumer-typeset
's compile-time seems to noticeably increase. Do you have any insight why that is and if we could do better? It looks like there's also a notable size change for consumer-typeset
, do you know if those are an improvement?
; CHECK: if.end.i.i: | ||
; CHECK-NEXT: ret void | ||
; | ||
entry: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to reduce the test a bit further?
This patch resets
NumRangeExtensions
to zero when merging an undef with a constant range.This change also matches the behavior of
markConstantRange
:llvm-project/llvm/include/llvm/Analysis/ValueLattice.h
Lines 375 to 380 in 4a45648
Fixes a regression caused by #76295.
See also dtcxzyw/llvm-opt-benchmark#36 (comment).
Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=0e4a38018a7228d93d72a31d9fae6855f866dded&to=b599e2e59104ef7c3d230d9f7b7ebf7fed56d48d&stat=instructions%3Au